Abstract. In order to study how well a finite group might be generated by repeated
random multiplications, P. Diaconis suggested the following urn model. An
urn contains some balls labeled by elements which generate a group G. Two are
drawn at random with replacement and a ball labeled with the group product (in
the order they were picked) is added to the urn. We give a proof of his conjecture
that the limiting fraction of balls labeled by each group element almost surely
approaches $1/|G|$.



1. Introduction 

In order to study how well a finite group might be generated by repeated random 
multiplications, P. Diaconis suggested the following urn model. An urn contains 
some balls labeled by elements which generate a group G. Two are drawn at random 
with replacement and a ball labeled with the group product (in the order they were 
picked) is added to the urn. He conjectured that the limiting fraction of balls labeled 
by each group element approaches $1/|G|$ with probability 1.

This problem arose from work of Diaconis and S. Rees, who were studying a group
theoretic algorithm called MeatAxe. (For further reading see [?].) This is a widely
used tool for decomposing representations of a finite group G over a finite field F.
To begin the MeatAxe, a random element of the group algebra FG must be chosen.
In practice, this is done by taking a sum of a few products of generators such as
B + AB + BA + BAB^2 + A + A and "hoping for the best."

Diaconis and Rees began the study of more careful algorithms which would provably 
converge to a random element of FG. One proposal was this: from $x \in FG$, go to
$sx$, $s^{-1}x$, or $x + as$, where $s$ is uniformly chosen from a generating set of G and $a$
is uniformly chosen from F. The problem studied here arose as a sub-problem in 
analyzing this algorithm; it turns out to be challenging even for a group with two 
elements. 
Note that, if G = Z/2Z = {0, 1} and the urn contains a large fraction of 0's, then the
probability of adding another is also large, and it may seem that the preponderance
of 0's continues. On the other hand, a simple computation shows that the expected
fraction of 1's moves toward 1/2.
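
The simple computation can be spelled out. If the urn holds $t$ balls of which a fraction $p$ are labeled 1, the new ball is labeled 1 exactly when the two draws differ, which happens with probability $2p(1-p)$, so the expected fraction after one step is $(tp + 2p(1-p))/(t+1)$. The sketch below checks this with exact rational arithmetic; the function name `expected_next_fraction` is ours and purely illustrative.

```python
from fractions import Fraction


def expected_next_fraction(ones: int, total: int) -> Fraction:
    """Expected fraction of 1's after one step of the Z/2Z urn.

    Draws are with replacement; the added ball is labeled 1 exactly when
    the two draws differ, which has probability 2p(1-p).
    """
    p = Fraction(ones, total)
    prob_add_one = 2 * p * (1 - p)
    return (ones + prob_add_one) / (total + 1)


if __name__ == "__main__":
    for ones, total in [(1, 10), (5, 10), (9, 10)]:
        before = Fraction(ones, total)
        after = expected_next_fraction(ones, total)
        print(f"{before} -> {after}")
```

The drift per step is $p(1-2p)/(t+1)$: positive when $p < 1/2$, negative when $p > 1/2$, and zero at $p = 1/2$.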

In this paper we use an elementary method to prove the conjecture of Diaconis. 
D. Siegmund and B. Yakir [2] have obtained these results independently as an
application of an almost supermartingale convergence theorem, and another proof based
on large deviations has been given by A. Shwartz and A. Weiss [1]. We remark that
the rate of convergence of this procedure is open at this writing. 

The work of EZ has been supported in part by an Alfred P. Sloan Foundation fellowship.
EZ also thanks the School of Mathematics, Statistics, and Computer Science
at Victoria University, Wellington, where some of this work took place. The authors 
thank P. Diaconis for introducing us to this problem and for helpful conversations 
along the way. We also thank Julie Landau whose support was essential to this work. 



2. Notation and Outline 

Let $G = \{g_1, \dots, g_d\}$ be a finite group. The state $s$ of an urn is described by the
number of balls with each of the $d$ possible labels; thus we write

$$s = (n_1, \dots, n_d)$$

where $n_i = n_{g_i}$ is the number of balls labeled $g_i$. We measure time by the total
number of balls:

$$t = \sum_{i=1}^{d} n_i(t);$$

in particular the starting time of the process is a positive integer determined by the
initial configuration. To emphasize that the state evolves we will write $s = s(t)$ and
$n_i = n_i(t)$. Let

$$p_i(s(t)) = p_{g_i}(s(t))$$

be the fraction of balls labeled $g_i$ in state $s(t)$; we will also refer to $p_i(s)$ as the density
of such balls. Clearly, $\sum_i p_i(s) = 1$. We often omit the explicit dependence on $s$ and
$t$ and write $p_i$ for $p_i(s(t))$ and $n_i$ for $n_i(t)$.

Our main result is the following. 

Corollary 2. Assume the urn initially contains a set of balls whose labels generate
the finite group $G$. Then with probability one, for each $i$,

$$p_i \to 1/d$$

as $t \to \infty$.

In other words, in the limit, the urn contains an equal fraction of all group elements. 

We now describe some of the notation and strategy used in the proof. Our process
can be described as a random walk on the non-negative integer lattice in $\mathbb{R}^d$, with
the (state-dependent) probability of adding one to the $j$th coordinate given by

$$\pi_j(s) := \sum_{g \in G} p_g(s)\, p_{g^{-1} g_j}(s). \tag{1}$$

That is, at state $s = (n_1, \dots, n_d)$ we have

$$n_j(t+1) = \begin{cases} n_j(t) + 1 & \text{with probability } \pi_j(s(t)), \\ n_j(t) & \text{otherwise.} \end{cases} \tag{2}$$

Observe, of course, that the functions $n_j$ do not evolve independently, as they are
constrained by $\sum_i n_i(t) = t$.
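
To make the transition rule concrete, here is a minimal sketch of the urn as a computation, assuming the group is supplied as a multiplication table of indices. The helper names (`transition_probs`, `step`, `cyclic_group`) and the choice of Z/3Z as the example are ours, not the paper's. The first function evaluates $\pi_j(s) = \sum_g p_g(s)\, p_{g^{-1} g_j}(s)$ exactly; the second performs one random step of the urn.

```python
import random
from fractions import Fraction


def transition_probs(counts, mul, inv):
    """Return [pi_0, ..., pi_{d-1}] for the urn state `counts`.

    counts[i] is the number of balls labeled g_i, mul[i][k] is the index
    of the product g_i * g_k, and inv[i] is the index of g_i^{-1}.
    """
    t = sum(counts)
    d = len(counts)
    p = [Fraction(n, t) for n in counts]
    # pi_j = sum over g of p_g * p_{g^{-1} g_j}
    return [sum(p[g] * p[mul[inv[g]][j]] for g in range(d)) for j in range(d)]


def step(counts, mul, rng):
    """Draw two balls with replacement and add their product, in draw order."""
    a, b = rng.choices(range(len(counts)), weights=counts, k=2)
    counts[mul[a][b]] += 1


def cyclic_group(n):
    """Index tables for Z/nZ, written additively (illustrative example)."""
    mul = [[(i + k) % n for k in range(n)] for i in range(n)]
    inv = [(-i) % n for i in range(n)]
    return mul, inv


if __name__ == "__main__":
    mul, inv = cyclic_group(3)
    counts = [0, 1, 0]  # a single ball labeled by the generator 1
    print(transition_probs(counts, mul, inv))
    rng = random.Random(0)
    for _ in range(20000):
        step(counts, mul, rng)
    print([n / sum(counts) for n in counts])
```

Since exactly one ball is added per step, the probabilities returned by `transition_probs` always sum to 1, which mirrors the constraint that the coordinates sum to $t$.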

Now, let $0 < \alpha < 1/d$, and define

$$\Sigma_\alpha = \{ (x_1, \dots, x_d) \in \mathbb{R}^d \mid x_j \ge \alpha (x_1 + \cdots + x_d) \text{ for each } 1 \le j \le d \};$$

thus $s(t) \in \Sigma_\alpha$ if the balls of each label have density at least $\alpha$.

We will deduce our main result by showing that as $t$ grows, $\mathrm{Prob}\{s(t) \in \Sigma_\beta\}$ almost
surely approaches 1 for each $\beta < 1/d$. The argument has two main ingredients which
are contained in Lemma 2 and Theorem 2. Lemma 2 shows that with probability
exponentially close to 1, if the distribution of labels is not uniform then the lowest
density in the urn increases by a constant factor after evolving for a fixed fraction of
the elapsed time. By iterating, we can bring the lowest density arbitrarily close to
$1/d$. To make this result effective, however, this lowest density must be nonzero, i.e.,
every element of G must have a representative in the urn. This is clearly true for
G = Z/2Z, so Lemma 2 is sufficient for this case. However, we require the additional
arguments of Section 4, which use the result in the Z/2Z case, to show that an urn
containing only a generating set will eventually (with probability 1) contain each
element of G.

3. Special case: the urn contains every element 

We begin with a bound on the transition probabilities $\pi_j(s)$ for states $s \in \Sigma_\alpha$.

Lemma 1. If $s \in \Sigma_\alpha$, then for each $j$, the transition probability satisfies

$$\pi_j(s) \ge 2\alpha - d\alpha^2.$$

Proof. We prove more generally that if $x_i, y_i \ge \alpha$ for $i = 1, \dots, d$ and $\sum x_i = \sum y_i = 1$
then $\sum x_i y_i \ge 2\alpha - d\alpha^2$. The lemma follows by setting $x_i = p_{g_i}$ and $y_i = p_{g_i^{-1} g_j}$ in (1).

Writing $x_i = \alpha + x_i'$ and $y_i = \alpha + y_i'$ with $x_i', y_i' \ge 0$ and $\sum_{i=1}^d x_i' = \sum_{i=1}^d y_i' = 1 - d\alpha$, we
have

$$\sum x_i y_i = \sum \left[\alpha^2 + \alpha(x_i' + y_i') + x_i' y_i'\right] = d\alpha^2 + \alpha(2 - 2d\alpha) + \sum x_i' y_i' \ge 2\alpha - d\alpha^2.$$

□
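
As a numerical sanity check on Lemma 1 (not a substitute for the proof), the following sketch samples pairs of probability vectors whose entries are all at least $\alpha$ and confirms that their inner product never drops below $2\alpha - d\alpha^2$. The sampling scheme and all function names are our own illustrative choices.

```python
import random


def lemma1_bound(alpha: float, d: int) -> float:
    """The lower bound 2*alpha - d*alpha^2 from Lemma 1."""
    return 2 * alpha - d * alpha * alpha


def random_point(alpha: float, d: int, rng: random.Random):
    """Random vector with all entries >= alpha summing to 1 (needs d*alpha <= 1)."""
    slack = 1 - d * alpha
    w = [rng.random() for _ in range(d)]
    total = sum(w)
    return [alpha + slack * wi / total for wi in w]


def min_inner_product(alpha: float, d: int, trials: int, rng: random.Random) -> float:
    """Smallest value of sum_i x_i * y_i seen over `trials` random pairs."""
    best = float("inf")
    for _ in range(trials):
        x = random_point(alpha, d, rng)
        y = random_point(alpha, d, rng)
        best = min(best, sum(a * b for a, b in zip(x, y)))
    return best


if __name__ == "__main__":
    rng = random.Random(1)
    alpha, d = 0.1, 4
    print("observed minimum:", min_inner_product(alpha, d, 5000, rng))
    print("Lemma 1 bound:   ", lemma1_bound(alpha, d))
```

Equality in the bound requires $\sum x_i' y_i' = 0$, that is, the excess mass of $x$ and the excess mass of $y$ must sit on disjoint coordinates.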



The main lemma, which follows, says that (for appropriate choices of $r, \gamma > 1$) if the
state $s$ of the urn is in $\Sigma_\alpha$ at time $T$, then at time $rT$ the chance that the state
is in $\Sigma_{\gamma\alpha}$ is exponentially close to 1. We want to iterate this argument until we can
conclude with high probability that the state is in $\Sigma_\beta$; for the iteration it is important
that $\gamma$ and the coefficient of the exponential term do not depend on $\alpha$.

Lemma 2. Let $0 < \alpha < 1/d$, and fix $r$ in the range $1 < r < 2 - \alpha d$. There exist
constants $\gamma > 1$ and $C > 1$, depending only on $r$, such that for all $T$:

$$\text{if } s(T) \in \Sigma_\alpha \text{ then } \mathrm{Prob}\{s(rT) \in \Sigma_{\gamma\alpha}\} \ge 1 - C^{-\alpha T}. \tag{3}$$

Proof. We examine the walk independently in each coordinate direction. The main
idea is that in a given coordinate, we can compare the process (2) over a certain
period of time to a random walk with the constant transition probability given by
the bound in Lemma 1. As this is a simple random walk, it is easy to estimate its
behavior. The choice of time period is somewhat delicate, however: if it is too short,
there is not enough time to progress toward the mean (with high probability); on the
other hand the constant lower bound on the transition probability is only valid for a
limited time, since any $p_i$ could eventually become arbitrarily close to zero.

Suppose the urn is in state $s(T) = (n_1(T), \dots, n_d(T)) \in \Sigma_\alpha$ at some time $T$. Note
first that, for $T \le t \le rT$, the value of $p_i(t)$ can never fall below

$$n_i(T)/rT \ge \alpha/r. \tag{4}$$

As this is true for every $i$, it follows from Lemma 1 that

$$\pi_j(s(t)) \ge 2\left(\frac{\alpha}{r}\right) - d\left(\frac{\alpha}{r}\right)^2 \tag{5}$$

for $T \le t \le rT$.

Now consider a random walk $W_i$ on the integers which begins at time $T$ at the value
$n_i(T)$, and evolves according to the probabilities

$$W_i(t+1) = \begin{cases} W_i(t) + 1 & \text{with probability } 2\alpha/r - d\alpha^2/r^2, \\ W_i(t) & \text{otherwise.} \end{cases}$$



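By comparing the evolution (2) with $W_i$ we see from (5) that for $T \le t \le rT$ and any $x$,

$$\mathrm{Prob}\{n_i(t) \ge x\} \ge \mathrm{Prob}\{W_i(t) \ge x\}.$$

It follows that for any $\gamma$,

$$\begin{aligned}
\mathrm{Prob}\{p_i(rT) \ge \gamma\alpha\} &= \mathrm{Prob}\{n_i(rT)/(rT) \ge \gamma\alpha\} \\
&\ge \mathrm{Prob}\{W_i(rT) \ge \gamma\alpha rT\} \\
&= \mathrm{Prob}\{X_i((r-1)T) \ge \gamma\alpha rT - n_i(T)\} \\
&\ge \mathrm{Prob}\{X_i((r-1)T) \ge (\gamma r - 1)\alpha T\},
\end{aligned} \tag{6}$$

where $X_i(t) := W_i(T + t) - n_i(T)$ is the (space and time) translate of $W_i$ which starts
at 0 at time 0. The last inequality uses $n_i(T) \ge \alpha T$.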

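This last probability is easy to estimate, as $X_i$ is a sum of independent identically
distributed Bernoulli trials. Specifically, the random variable $X_i(t+1) - X_i(t)$ takes
values 0 and 1 and has mean

$$\mu := 2\frac{\alpha}{r} - d\left(\frac{\alpha}{r}\right)^2. \tag{7}$$

Then since $d\alpha < 2 - r$ and $1 < r < 2$ we have

$$\frac{\mu}{\alpha} = \frac{2}{r} - \frac{d\alpha}{r^2} > \frac{3r - 2}{r^2} > 1. \tag{8}$$

It follows that $\lim_{\gamma \to 1^+} \frac{(\gamma r - 1)\alpha}{(r-1)\mu} = \frac{\alpha}{\mu} < 1$; hence we can choose $\gamma > 1$ depending on $r$
(but not $\alpha$) so that

$$\delta := \frac{(\gamma r - 1)\alpha}{(r - 1)\mu} < 1.$$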

The Chernoff bound [1] estimates the probability after time $t$ that $X_i$ is above a
fraction $\delta$ of its expected value:

$$\mathrm{Prob}\{X_i(t) \ge \delta\mu t\} \ge 1 - \exp(-(1-\delta)^2 \mu t/2). \tag{9}$$

Applying (9) at time $t = (r-1)T$ and with the above $\delta$, we obtain

$$\begin{aligned}
\mathrm{Prob}\{X_i((r-1)T) \ge (\gamma r - 1)\alpha T\} &\ge 1 - \exp\{-(1-\delta)^2 \mu (r-1)T/2\} \\
&\ge 1 - \exp\{-(1-\delta)^2 (r-1)\alpha T/2\} \\
&= 1 - A^{-\alpha T},
\end{aligned}$$

where the second inequality follows from (8) and where

$$A = \exp\{\tfrac{1}{2}(1-\delta)^2 (r-1)\} > 1.$$

In view of (6), this gives the desired bound for each $p_i$. Since the identical bound
holds for each $i$, we conclude that the chance that the state $s(rT)$ is inside $\Sigma_{\gamma\alpha}$ is at
least $1 - dA^{-\alpha T} = 1 - C^{-\alpha T}$. □
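
The Chernoff estimate used in this proof can be compared against exact binomial tails. The sketch below is illustrative only: the function names are ours, and the parameter values in the demo are arbitrary. It computes $\mathrm{Prob}\{X \ge \delta\mu t\}$ exactly for a sum $X$ of $t$ independent Bernoulli($\mu$) trials and prints it next to the lower bound $1 - \exp(-(1-\delta)^2\mu t/2)$.

```python
from math import ceil, comb, exp


def exact_upper_tail(t: int, mu: float, threshold: float) -> float:
    """P(X >= threshold) for X ~ Binomial(t, mu), summed term by term."""
    k0 = max(0, ceil(threshold))
    return sum(comb(t, k) * mu**k * (1 - mu) ** (t - k) for k in range(k0, t + 1))


def chernoff_lower_bound(t: int, mu: float, delta: float) -> float:
    """The bound 1 - exp(-(1 - delta)^2 * mu * t / 2) for P(X >= delta*mu*t)."""
    return 1 - exp(-((1 - delta) ** 2) * mu * t / 2)


if __name__ == "__main__":
    mu, delta = 0.3, 0.55
    for t in (20, 50, 100, 200):
        exact = exact_upper_tail(t, mu, delta * mu * t)
        bound = chernoff_lower_bound(t, mu, delta)
        print(f"t={t:4d}  exact={exact:.6f}  bound={bound:.6f}")
```

The bound is not tight, but its failure probability decays exponentially in $\mu t$, which is all the proof of Lemma 2 needs.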

Note once again that Lemma 2 provides no information if $\alpha = 0$. Theorem 1 below
assumes $\alpha$ is positive, and Theorem 2 in the next section shows that this assumption
is eventually valid when the urn begins with a set of generators for G. We need one
more preparatory lemma before proving Theorem 1.

Lemma 3. Suppose there is a time at which $n_i \ge 1$ for each $i$. Then for each $N \in \mathbb{N}$,
there is almost surely a time at which $n_i \ge N$ for each $i$.

Proof. We proceed by induction on $N$. Assume each $n_i(T) \ge N \ge 1$ at some time $T$.
Thus for every $t \ge T$ we have $p_i \ge N/t$, so by Lemma 1

$$\pi_j(s) \ge \frac{2N}{t} - \frac{dN^2}{t^2} \ge N/t$$

for each $j$, where the last inequality follows since necessarily $T \ge dN$. That is, the
chance of increasing $n_j$ at time $t$ is at least $N/t$ for each $t \ge T$. As $\sum (N/t)$ diverges,
with probability one $n_j$ will eventually increase. This holds for all $j$, so eventually
each $n_j$ will be at least $N + 1$. □
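
The divergence step in this proof can be illustrated numerically. If the chance of increasing $n_j$ at time $t$ is at least $N/t$ whatever the history, then the chance of no increase at any time $T \le t < M$ is at most $\prod_{t=T}^{M-1}(1 - N/t)$, which tends to 0 as $M$ grows because $\sum N/t$ diverges. The sketch below (function name ours) evaluates this product exactly.

```python
from fractions import Fraction


def prob_no_increase_bound(N: int, T: int, M: int) -> Fraction:
    """Upper bound prod_{t=T}^{M-1} (1 - N/t) on the chance that n_j
    fails to increase at every time T <= t < M (requires T >= N)."""
    bound = Fraction(1)
    for t in range(T, M):
        bound *= 1 - Fraction(N, t)
    return bound


if __name__ == "__main__":
    for M in (100, 1000, 10000):
        print(M, float(prob_no_increase_bound(1, 10, M)))
```

For $N = 1$ the product telescopes to $(T-1)/(M-1)$, so the decay to zero is visible in closed form.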

Theorem 1. Suppose the urn is in a state with each $n_i \ge 1$. Then with probability
one, for each $i$,

$$p_i \to 1/d$$

as $t \to \infty$.

Corollary 1. Let G = Z/2Z = {0, 1}. If the urn initially contains at least one ball
labeled 1, then the densities of the two elements almost surely approach 1/2.

Proof of Corollary 1. We are given that $n_1 \ge 1$, and after possibly waiting for one
step we will have $n_0 \ge 1$ as well. Thus Theorem 1 applies. □

Proof of Theorem 1. Fix $\beta < 1/d$ and let $\epsilon > 0$. It suffices to show that $t'$ can be
chosen so that $\mathrm{Prob}\{s(t) \in \Sigma_\beta\} > 1 - \epsilon$ for all $t > t'$.

Choose $\beta'$ between $\beta$ and $1/d$. We will show that with high probability, the state
evolves into $\Sigma_{\beta'}$ and then stays inside $\Sigma_\beta$. For both steps we will apply Lemma 2
iteratively.

Let $r = 2 - d\beta'$. Note that for any $\alpha < \beta'$, this choice of $r$ satisfies the hypothesis of
Lemma 2; thus we have a $C$ and $\gamma$ depending on $r$ (but not on $\alpha$) such that for all
$\alpha < \beta'$ and for all $T$, (3) holds.

For any $\alpha < \beta'$, then, we may iterate Lemma 2 with the same value of $r$. Thus for
$s(T) \in \Sigma_\alpha$, we have a bound on the chance that $s(r^j T)$ is not in $\Sigma_{\gamma^j \alpha}$:

$$\mathrm{Prob}\{s(r^j T) \notin \Sigma_{\gamma^j \alpha}\} \le \sum_{i=0}^{j-1} C^{-(\gamma r)^i \alpha T} \tag{10}$$

as long as $\gamma^{j-1}\alpha < \beta'$. As $j$ tends to infinity, the above sum converges; indeed we
can bound the sum independently of $j$ by a function $f$ of $\alpha T$ which decreases to 0 as
$\alpha T$ tends to infinity.

Choose $N \in \mathbb{N}$ so large that $f(N) < \epsilon/4$. By assumption, there is a time at which each
$n_i \ge 1$; therefore by Lemma 3 we may choose $T$ such that with probability at least
$1 - \epsilon/4$, each $n_i(T) \ge N$. Let $\alpha = N/T$; then by definition

$$\mathrm{Prob}\{s(T) \in \Sigma_\alpha\} \ge 1 - \epsilon/4,$$

and so by (10) and our choice of $N$,

$$\mathrm{Prob}\{s(r^k T) \in \Sigma_{\beta'}\} \ge 1 - \epsilon/2 \tag{11}$$

where $k$ is the smallest positive integer with $\gamma^k \alpha \ge \beta'$.

Let $t' = r^k T$. It remains to show that the probability of escaping $\Sigma_\beta$ at any time $t > t'$
can be made arbitrarily small. For this we apply Lemma 2 with the role of $\alpha$ played
by $\beta'$ and the role of $r$ played by any $r' > 1$ chosen to be less than $\min\{\beta'/\beta,\ 2 - d\beta'\}$.
The lemma provides a $C' > 1$ and a $\gamma' > 1$, and as $\Sigma_{\gamma'\beta'} \subset \Sigma_{\beta'}$, we may ignore $\gamma'$ and
write, for all $t$,

$$\text{if } s(t) \in \Sigma_{\beta'} \text{ then } \mathrm{Prob}\{s(r't) \in \Sigma_{\beta'}\} \ge 1 - (C')^{-\beta' t}.$$

Iterating as before, it follows that if $\beta' t$ is sufficiently large, we can ensure that

$$s(t) \in \Sigma_{\beta'} \text{ implies } \mathrm{Prob}\{s((r')^k t) \in \Sigma_{\beta'} \text{ for all } k \ge 0\} \ge 1 - \epsilon/2. \tag{12}$$

Note that if $s((r')^k t') \in \Sigma_{\beta'}$ for all $k \ge 0$ then $s(t) \in \Sigma_\beta$ for all $t \ge t'$, because
$\beta'/r' > \beta$.

So, to complete the proof, we must rechoose our initial value of $T$ if necessary so that
$t' = r^k T$ is large enough for (12) to hold with $t = t'$. Then by (11) and (12) we have

$$\mathrm{Prob}\{s(t) \in \Sigma_\beta\} > 1 - \epsilon$$

for all $t > t'$. □

4. General case: the urn contains a generating set 

The goal of this section is to prove Theorem 2, from which the main result (Corollary 2)
follows.

Theorem 2. Suppose the urn contains a set of balls whose labels generate the group
$G$ of order $d$. Eventually, with probability one, the urn will contain a ball labeled by
each group element.

Corollary 2. Suppose the urn initially contains a set of balls whose labels generate
$G$. Then with probability one, for each $i$,

$$p_i \to 1/d$$

as $t \to \infty$.

Proof of Corollary 2. This follows immediately from Theorems 1 and 2. □

To prove the theorem we will need three lemmas, which are stated next but proved
after the theorem. In outline, the argument begins at a time $T$ (to be chosen big
enough) with a set $S$ of elements, each with density bigger than a small constant.
Lemma 4 shows that with high probability, at a later moment all of the elements
of the subgroup $H$ generated by $S$ will have density bigger than a smaller constant.
Lemma 5 produces a later time when an element outside $H$ will have a comparable
density, and Lemma 6 shows that at that time, with high probability the density of the
elements of $H$ will not have fallen very much. In this way, a larger set with nonzero
density is produced. By iterating we conclude that eventually the whole group will
have positive density.



8 AARON ABRAMS, HENRY LANDAU, ZEPH LANDAU, JAMES POMMERSHEIM, ERIC ZASLOW 



For a subset $S \subset G$ let $n_S(t) = \sum_{g \in S} n_g(t)$ and $p_S(t) = \sum_{g \in S} p_g(t) = n_S(t)/t$. Also
let $G \setminus S$ denote the complement of $S$ in $G$, and let $\langle S \rangle$ denote the subgroup of $G$
generated by $S$.

Lemma 4. Let $0 < \nu < 1$. There exists a constant $c > 1$, depending on $\nu$, such
that for all $T$, if $p_g(T) \ge \nu$ for all $g$ in a subset $S$ of $G$, then with probability at least
$1 - c^{-T}$,

$$p_g(T_1) \ge 16\left(\frac{\nu}{16}\right)^{2^d} \text{ for all } g \in \langle S \rangle,$$

where $T_1 = 2^d T$.

Lemma 5. Let $H \ne G$ be a subgroup of $G$, and let $T$ be any time. With probability
1 there exists a time $t > T$ for which $p_{G \setminus H}(t) \ge 1/4$.

Lemma 6. For $\nu$ sufficiently small, there exists a constant $C$ depending on $\nu$ such
that for all $T$ and $T'$, if $p_h(T) \ge \nu$ for all $h$ in a subgroup $H$ and $p_{G \setminus H}(t) \le d\nu$ for
all $T \le t \le T'$, then with probability at least $1 - C^{-T}$,

$$p_h(T') \ge \frac{2\nu}{3} \text{ for all } h \in H.$$

Proof of Theorem 2. Let $\nu_0 < 1/d$ be small enough that Lemma 6 holds and also that
$\nu_0' = 16\left(\frac{\nu_0}{16}\right)^{2^d} < \frac{1}{4d}$, i.e. $\nu_0' d < \frac{1}{4}$. Fix $T$ and define

$$S_0 = \{g \in G : p_g(T) \ge \nu_0\}.$$

Since the $p_g(T)$ sum to 1 and $\nu_0 < 1/d$, $S_0$ is nonempty. By Lemma 4 we have that
with probability at least $1 - c^{-T}$,

$$p_h(T_1) \ge \nu_0' \text{ for all } h \in H_0 = \langle S_0 \rangle,$$

where $T_1 = 2^d T$. If $H_0 \ne G$, then Lemma 5 applied to $H_0$ guarantees that $p_{G \setminus H_0}(t) \ge
1/4 > \nu_0' d$ at some time $t$; let $T_1' \ge T_1$ be the first such time. Then there must be
a $g^* \notin H_0$ with $p_{g^*}(T_1') \ge \nu_0'$. Now Lemma 6 implies that with probability at least
$1 - C^{-T}$, $p_h(T_1') \ge 2\nu_0'/3 =: \nu_1$ for all $h \in H_0$. Let

$$S_1 = \{g \in G : p_g(T_1') \ge \nu_1\};$$

since $\nu_0' > \nu_1$, $S_1$ includes $g^*$ as well as all of $H_0$. Hence $H_1 = \langle S_1 \rangle$ is strictly larger
than $H_0$, and we may repeat the argument. After some number $k \le d$ of iterations,
$H_k$ must equal $G$ as desired.

To complete the proof, note that once the $\nu_j$ have been fixed, the exceptional
probability in this argument is on the order of $c^{-T}$ for some $c > 1$, hence it can be made
arbitrarily small by choosing a sufficiently large initial time $T$. □

Proof of Lemma 4. In running the evolution (2) until time $2T$, each $p_i \ge \nu/2$ for
$g_i \in S$ as in (4). Hence the probability at each step of adding any product $g_i g_j$ to
the urn where $g_i, g_j \in S$ is at least $(\nu/2)^2$. By choosing $\delta = \frac{1}{2}$ in Chernoff's bound, the
number of occurrences of $g_i g_j$ in the urn at time $2T$ exceeds $\frac{1}{2}T(\nu/2)^2$ (or, equivalently,
$p_{g_i g_j}(2T) > \nu^2/16$) with probability at least $1 - e^{-\frac{1}{8}T(\nu/2)^2}$.

Since every element of $\langle S \rangle$ can be expressed as a product of at most $|\langle S \rangle| \le |G| =
d$ elements of $S$, we iterate this argument $d$ times to obtain all elements of $\langle S \rangle$.
Specifically, at time $2^d T$ each element of $H = \langle S \rangle$ has density at least

$$16\left(\frac{\nu}{16}\right)^{2^d}$$

with probability at least $1 - c^{-T}$, where $c > 1$ is a suitable constant depending on $\nu$
and $d$. □
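
The density bound produced by the iteration can be checked mechanically. One round turns a lower bound $\nu$ at time $T$ into the lower bound $\frac{1}{2}T(\nu/2)^2/(2T) = \nu^2/16$ at time $2T$, and composing this map $d$ times gives $16(\nu/16)^{2^d}$. The sketch below, with function names of our own choosing, confirms the closed form in exact arithmetic.

```python
from fractions import Fraction


def iterate_density_bound(nu: Fraction, d: int) -> Fraction:
    """Apply the one-round map nu -> nu^2 / 16 a total of d times."""
    for _ in range(d):
        nu = nu * nu / 16
    return nu


def closed_form(nu: Fraction, d: int) -> Fraction:
    """The closed form 16 * (nu / 16) ** (2 ** d)."""
    return 16 * (nu / 16) ** (2**d)


if __name__ == "__main__":
    nu = Fraction(1, 3)
    for d in range(1, 5):
        print(d, iterate_density_bound(nu, d) == closed_form(nu, d))
```

The bound shrinks doubly exponentially in $d$, which is harmless here because $d$ is fixed and only positivity of the density matters.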

Proof of Lemma 5. We will couple the behavior of $n_H(t)$ and $n_{G \setminus H}(t)$ under the
evolution (2) with the behavior of the quantities $n_0(t)$ and $n_1(t)$ associated to a second
urn running the same evolution on the group Z/2Z, beginning with $n_H(T)$ balls
labeled 0 and $n_{G \setminus H}(T)$ balls labeled 1. Picking $h \in H$ and $k \in G \setminus H$ from the first
urn corresponds to picking 0 and 1 from the second urn, respectively. Since $hk$ and
$kh$ are in $G \setminus H$ we see that at times when $n_H = n_0$ and $n_{G \setminus H} = n_1$ the probability
of increasing $n_{G \setminus H}$ is at least as great as that of increasing $n_1$. Coupling the urns at
these times shows that $n_{G \setminus H}(t) \ge n_1(t)$ for all $t$. However, by Corollary 1, $n_1(t)/t$
approaches 1/2 as $t$ increases, hence $n_{G \setminus H}(t)/t = p_{G \setminus H}(t)$ eventually exceeds 1/4. □

Proof of Lemma 6. The reasoning is similar to that in Theorem 1. We begin by
showing for sufficiently small $\nu$ that if at time $T$, $p_h(T) \ge \nu$ for all $h \in H$ and
$p_{G \setminus H}(t) \le \nu d$ for $T \le t \le 3T/2$, then with probability $1 - C_0^{-T}$, $p_h(3T/2) \ge \nu$ for
all $h \in H$. A lower bound for the probability of adding $h \in H$ to the urn at time
$T \le t \le 3T/2$ is



where the first term is the probability of picking both elements from $H$ and the second
term is the lower bound given by Lemma 1 applied with $r = 3/2$ to the group $H$ of
size $d_0$ with normalized densities at least $\nu/p_H(t)$. By assumption $p_H(t)$ is at least
$1 - d\nu$, so (13) is bounded below by



(13) 





Thus by Chernoff's bound, with probability at least

$$1 - e^{-\delta^2 \mu T/4},$$

there will be at least $(1 - \delta)\mu T/2$ balls labeled $h \in H$ added to the urn between times
$T$ and $3T/2$. For small enough $\nu$ we can choose $\delta$ small enough so that

$$(1 - \delta)\mu > \nu,$$

and thus with probability $1 - C_0^{-T}$ for some $C_0 > 1$, we have $p_h(3T/2) \ge \nu$ for all $h \in H$.

By choosing $j$ so that $(3/2)^j T \le T' < (3/2)^{j+1} T$, we can iterate this argument $j$ times
to conclude that $p_h((3/2)^j T) \ge \nu$ for all $h \in H$ with probability at least

$$1 - \sum_{i=0}^{j-1} C_0^{-(3/2)^i T} \ge 1 - C^{-T},$$

where (as in the proof of Theorem 1) $C$ can be chosen independently of $j$. The
argument is completed by noting that, again as in (4), $p_h((3/2)^j T) \ge \nu$ implies $p_h(t) \ge
\frac{2}{3}\nu$ for all $(3/2)^j T \le t \le (3/2)^{j+1} T$. □